1 Introduction

Concurrent ML (CML) is a high-level, high-performance language for concurrent
programming. It is an extension of Standard ML (SML) [MTH90], and is implemented
on top of Standard ML of New Jersey (SML/NJ) [AM87]. CML is a
practical language and is being used to build real systems. It demonstrates that we
need not sacrifice high-level notation in order to have good performance. CML is
also a well-defined language. In the tradition of SML, it has a formal semantics and
its type-soundness has been proven.

Although most research in the area of concurrent language design has been moti- 
vated by the desire to improve performance by exploiting multiprocessors, I believe 
that concurrency is also a useful programming paradigm for certain application do- 
mains. For example, interactive systems often have a naturally concurrent structure 
[CP85, RG86, Pik89, Haa90, GR92]. Another example is distributed systems: most 
systems for distributed programming provide multi-threading at the node level (e.g., 
Isis [BCJ+90] and Argus [LCJS87]). Sequential programs in these application do- 
mains often must use complex and artificial control structures to schedule and inter- 
leave activities (e.g., event-loops in graphics libraries). They are, in effect, simulating 
concurrency. These application domains need a high-level concurrent language that 
provides both efficient sequential execution and efficient concurrent execution: CML 
satisfies this need. 


1.1 An Overview of CML 

CML is based on the sequential language SML [MTH90] and inherits the useful 
features of SML: functions as first-class values, strong static typing, polymorphism, 
datatypes and pattern matching, lexical scoping, exception handling and a state- 
of-the-art module facility. An introduction to SML can be found elsewhere in this 
volume [Oph92]; also see [Pau91] or [Har86]. The sequential performance of CML 
benefits from the quality of the SML/NJ compiler [AM87]. In addition, CML has
the following properties:

- CML provides a high-level model of concurrency with dynamic creation of
threads and typed channels, and rendezvous communication. This distributed-memory
model fits well with the mostly applicative style of SML.

- CML is a higher-order concurrent language. Just as SML supports functions
as first-class values, CML supports synchronous operations as first-class values
[Rep88, Rep91a, Rep92]. These values, called events, provide the tools for building
new synchronization abstractions. This is the most significant characteristic
of CML.

- CML provides integrated I/O support. Potentially blocking I/O operations,
such as reading from an input stream, are full-fledged synchronous operations.
Low-level support is also provided, from which distributed communication
abstractions can be constructed.

- CML provides automatic reclamation of threads and channels, once they become
inaccessible. This permits a technique of speculative communication, which is not
possible in other threads packages.

- CML uses preemptive scheduling. To guarantee interactive responsiveness, a
single thread cannot be allowed to monopolize the processor. Preemption ensures
that a context switch will occur at regular intervals, which allows “off-the-shelf”
code to be incorporated in a concurrent thread without destroying interactive
responsiveness.

- CML is efficient. Thread creation, thread switching and message passing are
very efficient (benchmark results are reported in [Rep92]). Experience with
CML has shown that it is a viable language for implementing usable interactive
systems [GR91].

- CML is portable. It is written in SML and runs on essentially every system
supported by SML/NJ (currently four different architectures and many different
operating systems).

- CML has a formal foundation. Following the tradition of SML [MTH90, MT91],
a formal semantics has been developed for the concurrency primitives of CML.


1.2 Organization of this Paper 


This paper is organized into three main parts. The first, consisting of Sections 2 
and 3, describes the design and rationale of CML. The second part focuses on 
the practical aspects: Section 4 describes the use of CML in real applications, and 
Section 5 briefly discusses the implementation of CML. The third part presents 
the formal underpinnings of CML: Section 6 gives the dynamic semantics of a small
concurrent language that supports the core CML concurrency mechanisms, and
then Section 7 gives a static semantics (i.e., a type system) for this language and
presents type-soundness results. The paper concludes with a discussion of related
work and a summary.



2 Basic Concurrency Primitives 


We start with a discussion of the basic concurrency operations provided by CML. 
A running CML program consists of a collection of threads, which use synchronous 
message passing on typed channels to communicate and synchronize. In keeping with 
the flavor of SML, both threads and channels are created dynamically (initially, a 
program consists of a single thread). The signature of the basic thread and channel 
operations is given in Figure 1. The function spawn takes a function as an argument 


    val spawn   : (unit -> unit) -> thread_id
    val channel : unit -> '1a chan
    val accept  : 'a chan -> 'a
    val send    : ('a chan * 'a) -> unit


Fig. 1. Basic concurrency primitives 


and creates a new thread to evaluate the application of the function to the unit value. 
Channels are also created dynamically using the function channel, which is weakly
polymorphic (see footnote 2). The functions accept and send are the synchronous communication
operations. When a thread wants to communicate on a channel, it must rendezvous 
with another thread that wants to do a complementary communication on the same 
channel. SML’s lexical scoping is used to share channels between threads, and to 
hide channels from other threads (note, however, that channels can be passed as 
messages). 
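
The rendezvous behaviour of send and accept can be approximated outside CML. The following is a minimal Python sketch, not CML itself: the names Chan, spawn and demo are illustrative, the code is built on the standard threading and queue modules, and it models only the blocking behaviour (typing, preemption and reclamation of threads are not modelled).

```python
import queue
import threading


class Chan:
    """Rendezvous channel sketch: send blocks until a receiver takes the value."""

    def __init__(self):
        self._slot = queue.Queue(maxsize=1)  # carries one offered value
        self._ack = queue.Queue(maxsize=1)   # receiver's acknowledgement
        self._send_lock = threading.Lock()   # one sender in the rendezvous at a time

    def send(self, value):
        with self._send_lock:
            self._slot.put(value)  # offer the value
            self._ack.get()        # wait until some receiver has taken it

    def accept(self):
        value = self._slot.get()   # wait for a sender
        self._ack.put(None)        # release that sender
        return value


def spawn(f):
    """Start f in a new thread and return the thread object."""
    t = threading.Thread(target=f, daemon=True)
    t.start()
    return t


def demo():
    ch = Chan()  # shared with the new thread through lexical scoping

    def producer():
        for i in range(3):
            ch.send(i * i)

    spawn(producer)
    return [ch.accept() for _ in range(3)]
```

Calling demo() collects the three squares in the order they were sent, because the single producer cannot proceed to its next send until the previous value has been accepted.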

Most CSP-style languages (e.g., Occam [Bur88] and Amber [Car86]) provide
similar rendezvous-style communication. In addition, they provide a mechanism for 
selective communication, which is necessary for threads to communicate with multi- 
ple partners. It is possible to use polling to implement selective communication, but 
to do so is awkward and requires busy waiting. Usually selective communication is 
provided as a multiplexed I/O operation. This can be a multiplexed input operation, 
such as Occam’s ALT construct [Bur88]; or a generalized (or symmetric) select oper- 
ation that multiplexes both input and output communications, such as Pascal-m’s 
select construct [AB86]. Implementing generalized select on a multi-processor can 
be difficult [Bor86], but there are situations in which generalized selective commu- 
nication is necessary (an example of this is given in Section 3.1). 

2 The “1” in the type variable of channel’s result type is the strength of the variable. This
is a technical mechanism used to allow polymorphic use of updatable objects without
creating type loopholes.

Unfortunately, there is a fundamental conflict between the desire for abstraction
and the need for selective communication. For example, consider a server thread
that provides a service via a request-reply (or remote procedure call (RPC) style)
protocol. The server side of this protocol is something like:

    fun serverLoop () = if serviceAvailable ()
          then let
            val request = accept reqCh
            in
              send (replyCh, doit request);
              serverLoop ()
            end
          else doSomethingElse ()

where the function doit actually implements the service. Note that the service is 
not always available. This protocol requires that clients obey the following two rules: 

1. A client must send a request before trying to read a reply. 

2. Following a request the client must read exactly one reply before issuing another 
request. 

If all clients obey these rules, then we can guarantee that each request is answered 
with the correct reply, but if a client breaks one of these rules, then the requests and 
replies will be out of sync. An obvious way to improve the reliability of programs 
that use this service is to bundle the client-side protocol into a function that hides 
the details, thus ensuring that the rules are followed. The following code implements 
this abstraction: 

fun clientCall x = (send(reqCh, x) ; accept replyCh) 

While this ensures that clients obey the protocol, it hides too much. If a client blocks
on a call to clientCall (e.g., if the server is not available), then it cannot respond 
to other communications. Avoiding this situation requires using selective commu- 
nication, but the client cannot do this because the function abstraction hides the 
synchronous aspect of the protocol. This is the fundamental conflict between selec- 
tive communication and the existing forms of abstraction. If we make the operation 
abstract, we lose the flexibility of selective communication; but if we expose the pro- 
tocol to allow selective communication, we lose the safety and ease of maintenance 
provided by abstraction. The next section describes our solution to this dilemma. 


3 First-class Synchronous Operations 

Resolving the conflict between abstraction and selective communication requires
introducing a new abstraction mechanism that preserves the synchronous nature of the
abstraction. First-class synchronous operations provide this abstraction mechanism
[Rep88, Rep89, Rep91a].

The traditional select construct has four facets: the individual I/O operations, 
the actions associated with each operation, the nondeterministic choice, and the
synchronization. In CML, we unbundle these facets by introducing a new type
of values, called events, that represent synchronous operations. By starting with 
base-event values to represent the communication operations, and providing com- 
binators to associate actions with events and to build nondeterministic choices of 
events, a flexible mechanism for building new synchronization and communication 
abstractions is realized. Event values provide a mechanism for building an abstract 
representation of a protocol without obscuring its synchronous aspect. 

To make this concrete, consider the following loop (using an Amber-style select
construct [Car86]), which implements the body of an accumulator that accepts either
addition or subtraction input commands and offers its contents:

    fun accum sum = (
          select addCh?x => accum(sum+x)
              or subCh?x => accum(sum-x)
              or readCh!sum => accum sum)

The select construct consists of three I/O operations: addCh?x, subCh?x, and
readCh!sum. For each of these operations there is an associated action on the
right-hand side of the =>. Taken together, each I/O operation and associated action define
a clause in a nondeterministic synchronous choice. It is also worth noting that the
input clauses define a scope: the input operation binds an identifier to the incoming
message, which has the action as its scope.

Figure 2 gives the signature of the event operations corresponding to the four 
facets of generalized selective communication. The functions receive and transmit 


    val receive  : 'a chan -> 'a event
    val transmit : ('a chan * 'a) -> unit event

    val choose : 'a event list -> 'a event
    val wrap   : ('a event * ('a -> 'b)) -> 'b event

    val sync : 'a event -> 'a


Fig. 2. Basic event operations 


build base-event values that represent channel I/O operations. The wrap combinator 
binds an action, represented by a function, to an event value. And the choose 
combinator composes event values into a nondeterministic choice. The last operation 
is sync, which forces synchronization on an event value. I call this set of operations 
“PML events,” since they constitute the mechanism that I originally developed in 
PML [Rep88]. 

The simplest example of events is the implementation of the synchronous channel 
I/O operations that were described in the previous section. These are defined using 
function composition, sync and the channel I/O event-value constructors:



val accept = sync o receive 
val send = sync o transmit 


A more substantial example is the accumulator loop from above, which is imple- 
mented as: 

    fun accum sum = sync (
          choose [
              wrap (receive addCh, fn x => accum (sum+x)),
              wrap (receive subCh, fn x => accum (sum-x)),
              wrap (transmit (readCh, sum), fn () => accum sum)
            ])

Notice how wrap is used to associate actions with communications. 
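
To see how the four facets fit together operationally, the following Python sketch models receive, transmit, wrap, choose and sync and then rebuilds the accumulator. It is an illustration under simplifying assumptions, not the CML implementation: every name is hypothetical, a single global lock serializes all matching, stale offers are discarded lazily, and the accumulator is written as a loop because Python lacks tail calls.

```python
import threading

_big_lock = threading.Lock()  # assumption: one global lock keeps the sketch simple


class Chan:
    def __init__(self):
        self.waiting = {"recv": [], "send": []}  # blocked offers, by kind


class Event:
    """An event value: a list of cases (kind, channel, value, action)."""

    def __init__(self, cases):
        self.cases = cases


def receive(ch):
    return Event([("recv", ch, None, lambda x: x)])


def transmit(ch, value):
    return Event([("send", ch, value, lambda _unit: None)])


def wrap(ev, f):
    def compose(action):
        return lambda x: f(action(x))  # run f after the existing action
    return Event([(k, c, v, compose(a)) for k, c, v, a in ev.cases])


def choose(evs):
    return Event([case for ev in evs for case in ev.cases])


class _Offer:
    def __init__(self):
        self.done = threading.Event()
        self.action = None
        self.arg = None


def sync(ev):
    me, matched = _Offer(), False
    with _big_lock:
        for kind, ch, value, action in ev.cases:
            partners = ch.waiting["send" if kind == "recv" else "recv"]
            while partners and not matched:
                other, o_value, o_action = partners.pop(0)
                if other.done.is_set():
                    continue  # stale: that offer already committed elsewhere
                # A receiver is handed the sender's value; a sender is handed None.
                me.action, me.arg = action, (o_value if kind == "recv" else None)
                other.action, other.arg = o_action, (value if kind == "send" else None)
                other.done.set()
                matched = True
            if matched:
                break
        if not matched:  # nothing ready: post every case and block
            for kind, ch, value, action in ev.cases:
                ch.waiting[kind].append((me, value, action))
    if not matched:
        me.done.wait()
    return me.action(me.arg)  # run the wrapper of the chosen case


def accum(add_ch, sub_ch, read_ch, total=0):
    while True:
        total = sync(choose([
            wrap(receive(add_ch), lambda x: total + x),
            wrap(receive(sub_ch), lambda x: total - x),
            wrap(transmit(read_ch, total), lambda _unit: total),
        ]))


def demo():
    add_ch, sub_ch, read_ch = Chan(), Chan(), Chan()
    threading.Thread(target=accum, args=(add_ch, sub_ch, read_ch), daemon=True).start()
    sync(transmit(add_ch, 10))
    sync(transmit(sub_ch, 3))
    return sync(receive(read_ch))
```

In this model an event is only a description of cases; no communication happens until sync is applied, which mirrors the distinction the text draws between building an event value and synchronizing on it.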

The great benefit of this approach to concurrency is that it allows the program- 
mer to create new first-class synchronization and communication abstractions. For 
example, we can define an event-valued function that implements the client-side of 
the RPC protocol given in the previous section as follows: 

fun clientCallEvt x = wrap (transmit (reqCh, x) , fn () => accept replyCh) 

Applying clientCallEvt to a value v does not actually send a request to the server;
rather, it returns an event value that can be used to send v to the server and then
accept the server’s reply. This event value can be used in a choose expression with
other communications, in which case the transmit base-event value is used in selecting
the event. This example shows that we can use first-class synchronous operations
to abstract away from the details of the client-server protocol, without hiding the 
synchronous nature of the protocol. 

This approach to synchronization and communication leads to a new programming
paradigm, which I call higher-order concurrent programming. To understand
the higher-order nature of this mechanism, it is helpful to draw an analogy with
first-class function values. Table 1 compares these two higher-order mechanisms.


Table 1. Relating first-class functions and events

Property           Function values          Event values
Type constructor   ->                       event
Introduction       λ-abstraction            receive, transmit, etc.
Elimination        application              sync
Combinators        composition, map, etc.   choose, wrap, etc.


















3.1 An Example 


An example that illustrates a number of key points is an implementation of a buffered 
channel abstraction. Buffered channels provide a mechanism for asynchronous com- 
munication, which is similar to the actor mailbox [Agh86]. The source code for 
this abstraction is given in Figure 3. The function buffer creates a new buffered 


    abstype 'a buffer_chan = BC of {
        inch : 'a chan,
        outch : 'a chan
      }
    with
      fun buffer () = let
            val inCh = channel () and outCh = channel ()
            fun loop ([], []) = loop ([accept inCh], [])
              | loop (front as (x::r), rear) = sync (
                  choose [
                      wrap (receive inCh, fn y => loop (front, y::rear)),
                      wrap (transmit (outCh, x), fn () => loop (r, rear))
                    ])
              | loop ([], rear) = loop (rev rear, [])
            in
              spawn (fn () => loop ([], []));
              BC{inch=inCh, outch=outCh}
            end

      fun bufferSend (BC{inch, ...}, x) = send (inch, x)

      fun bufferReceive (BC{outch, ...}) = receive outch
    end (* abstype *)


Fig. 3. Buffered channels 


channel, which consists of a buffer thread, an input channel and an output chan- 
nel; the function bufferSend is an asynchronous send operation; and the function 
bufferReceive is an event-valued receive operation. The buffer is represented as a
queue of messages, which is implemented as a pair of stacks (lists). This example 
illustrates several key points: 

- Buffered channels are a new communication abstraction, which have first-class
citizenship. A thread can use the bufferReceive function in any context that
it could use the built-in function receive, such as selective communication.

- The buffer loop uses both input and output operations in its selective
communication. This is an example of the necessity of generalized selective
communication. If we have only a multiplexed input construct (e.g., Occam’s ALT), then
we must use a request/reply protocol to implement the server side of the
bufferReceive operation (see pp. 37-41 of [Bur88], for example). But if a
request/reply protocol is used, then the bufferReceive operation cannot be used
in a selective communication by the client.

- The buffer thread is a good example of a common CML programming idiom:
using threads to encapsulate state. This style has the additional benefit of hiding
the state of the system in the concurrency operations, which makes the sequential
code cleaner. These threads serve the same role that monitors do in some
shared-memory concurrent languages.

- This implementation exploits the fact that unreachable blocked threads are
garbage collected. If the clients of this buffer discard it, then the buffer thread
and channels will be reclaimed by the garbage collector. This improves the
modularity of the abstraction, since clients do not have to worry about explicit
termination of the buffer thread.
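
The queue representation used by the buffer thread (a front list that is consumed and a rear list that accumulates new messages in reverse) can be shown in isolation. The following Python sketch uses hypothetical names (empty, enqueue, dequeue) and mirrors the three clauses of loop. Python lists are copied on slicing, so this shows the structure only and not the constant amortized cost of the cons-list version.

```python
def empty():
    """An empty queue: (front, rear)."""
    return ([], [])


def enqueue(q, y):
    front, rear = q
    return (front, [y] + rear)  # corresponds to y::rear


def dequeue(q):
    front, rear = q
    if not front:
        # corresponds to the clause loop ([], rear) = loop (rev rear, [])
        front, rear = list(reversed(rear)), []
    if not front:
        raise IndexError("dequeue from empty queue")
    return front[0], (front[1:], rear)
```

Messages come out in the order they went in: the rear list is reversed only when the front list runs dry, which is exactly when the oldest pending message sits at the end of the rear list.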


3.2 Other Synchronous Operations 

The event type provides a natural framework for accommodating other primitive
synchronous operations (see footnote 3). There are three examples of this in CML:
synchronization on thread termination (sometimes called process join), low-level I/O
support, and time-outs. Figure 4 gives the signature of the CML base-event constructors
for these other synchronous operations.


    val wait : thread_id -> unit event

    val syncOnInput  : int -> unit event
    val syncOnOutput : int -> unit event

    val waitUntil : time -> unit event
    val timeout   : time -> unit event


Fig. 4. Other primitive synchronous operations


The function wait produces an event for synchronizing on the termination of another
thread. This is often used by servers that need to release resources allocated to a
client in the case that the client terminates unexpectedly. Support for low-level I/O
is provided by the functions syncOnInput and syncOnOutput, which allow threads to
synchronize on the status of file descriptors [UNI86]. These operations are used in
CML to implement a multi-threaded I/O stream library. There are two functions for
synchronizing with the clock: waitUntil and timeout. The function waitUntil returns an
event that synchronizes on an absolute time, while timeout implements a relative delay.
The function timeout can be used to implement a timeout in a choice. The following
code, for example, defines an event that waits for up to a second for a message on a
channel:

3 This is the reason that I use the term “event” to refer to first-class synchronous
operations instead of using “communication.”



    choose [
        wrap (receive ch, SOME),
        wrap (timeout (TIME{sec=1, usec=0}), fn () => NONE)
      ]

By having a uniform mechanism for combining synchronous operations, CML provides
a great deal of flexibility with a fairly terse mechanism. As a comparison,
Ada has two different timeout mechanisms: a timed entry call for clients and a delay
statement that servers can include in a select.
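
For readers more familiar with mainstream thread libraries, a rough Python analogue of the timed receive above is sketched below. It assumes a buffered queue.Queue rather than a rendezvous channel, the function name receive_with_timeout is hypothetical, and None plays the role of NONE (so a message that is itself None would be ambiguous).

```python
import queue


def receive_with_timeout(ch, seconds):
    """Return the next message on ch, or None if none arrives within `seconds`."""
    try:
        return ch.get(timeout=seconds)  # blocks for at most `seconds`
    except queue.Empty:
        return None                     # the timeout branch was "chosen"
```

Unlike the CML version, the timeout here is baked into one library call; it cannot be combined with further alternatives in a larger choice, which is the flexibility the uniform event mechanism provides.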

CML also provides a polling mechanism. The operation

    val poll : 'a event -> 'a option

is a non-blocking form of the sync operator. It returns NONE in the case where sync
would block.


3.3 Extending PML events 

Thus far, I have described the PML subset of first-class synchronous operations. In 
this section, I motivate and describe two significant extensions to PML events that 
are provided in CML. 

Consider a protocol consisting of a sequence of communications: $c_1; c_2; \cdots; c_n$.
When this protocol is packaged up in an event value, one of the $c_i$ is designated as
the commit point, the communication by which this event is chosen in a selective
communication (e.g., the message send operation in the clientCallEvt abstraction
above). In PML events, the only possible commit point is $c_1$. The wrap construct
allows one to tack on $c_2; \cdots; c_n$ after $c_1$ is chosen, but there is no way to make any
of the other $c_i$ the commit point. This asymmetry is a serious limitation.

A good illustration of this problem is a server that implements an input-stream 
abstraction. Since this abstraction should be smoothly integrated into the concur- 
rency model, the input operations should be event-valued. For example, the function 

val input : instream -> string event 

is used to read a single character. In addition, there are other input operations such 
as input_line. Let us assume that the implementation of these operations uses a 
request-reply protocol; thus, a successful input operation involves the communica- 
tion sequence 

    send (ch_req, REQ_INPUT); accept (ch_reply)

Packaging this up as an event (as we did in Section 3) will make the send communi- 
cation be the commit point, which is the wrong semantics. To illustrate the problem 
with this, consider the case where a client thread wants to synchronize on the choice 
of reading a character and a five second timeout: 



sync (choose [ 

wrap (timeout (TIME{sec=5 , usec=0}) , fn () => raise Timeout), 
input instream 

]) 

The server might accept the request within the five second limit, even though the 
wait for input might be indefinite. The right semantics for the input operation 
requires making the accept be the commit point, which is not possible using only 
the PML subset of events. To address this limitation, CML provides the guard 
combinator. 


Guards. The guard combinator is the dual of wrap; it bundles code to be executed 
before the commit point; this code can include communications. It has the type 

val guard : (unit -> ’a event) -> ’a event 

A guard event is essentially a suspension that is forced when sync is applied to it. As 
a simple example of the use of guard, the timeout function, described in Section 3.2, 
is actually implemented using waitUntil and a guard: 

    fun timeout t = guard (fn () => waitUntil (add_time (t, currentTime ())))

where currentTime returns the current time. 

Some languages support guarded clauses in selective communication, where the 
guards are boolean expressions that must evaluate to true in order that the commu- 
nication be enabled. CML guards can be used for this purpose too, as illustrated 
by the following code skeleton: 

sync (choose [ 

guard (fn () => if pred then evt else choose []) 


]) 

Here evt is part of the choice only if pred evaluates to true. Note that the evaluation
of pred occurs each time the guard function is evaluated (i.e., each time the sync is
applied).
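
The point that a guard is a suspension forced at each sync can be checked with a small model. The Python sketch below is illustrative only: FakeClock, WaitUntil, Guard and eager_timeout are hypothetical names, and the clock is simulated so that the behaviour is deterministic. It contrasts the guarded definition of timeout with a definition that computes the deadline when the event is constructed.

```python
class FakeClock:
    """Simulated clock; sleeping simply advances the current time."""

    def __init__(self):
        self.now = 0.0

    def sleep_until(self, t):
        self.now = max(self.now, t)


clock = FakeClock()


class WaitUntil:
    """Base event: synchronizes on an absolute time."""

    def __init__(self, deadline):
        self.deadline = deadline

    def sync(self):
        clock.sleep_until(self.deadline)


class Guard:
    """Suspension: the function is run each time sync is applied."""

    def __init__(self, make_event):
        self.make_event = make_event

    def sync(self):
        return self.make_event().sync()


def timeout(delay):
    # Deadline is computed at sync time, as in the guarded definition.
    return Guard(lambda: WaitUntil(clock.now + delay))


def eager_timeout(delay):
    # Deadline is fixed at construction time, which gives the wrong semantics.
    return WaitUntil(clock.now + delay)
```

If both events are built at time 0 with a delay of 5 and synchronized at time 100, the eager version returns immediately because its deadline has long passed, while the guarded version waits until 105, and until 110 on a second sync.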

Returning to the RPC example from above, we can now build an abstract RPC 
operation with the reply as the commit point. The two different versions are: 

    fun clientCallEvt1 x = wrap (transmit (reqCh, x), fn () => accept replyCh)

    fun clientCallEvt2 x = guard (fn () => (send (reqCh, x); receive replyCh))

where the clientCallEvt1 version commits on the server’s acceptance of the request,
while the clientCallEvt2 version commits on the server’s reply to the request.
Note the duality of guard and wrap with respect to the commit point. Using
guards to generate requests in this way raises a couple of other problems. First of all,
if the server cannot guarantee that requests will be accepted promptly, then evaluating
the guard may cause delays. The solution to this is to spawn a new thread to
issue the request asynchronously:

    fun clientCallEvt3 x = guard (fn () => (
          spawn (fn () => send (reqCh, x));
          receive replyCh))

Another alternative is for the server to be a clearing-house for requests, spawning a
new thread to handle each new request.

The other problem is more serious: what if this RPC event is used in a selective 
communication and some other event is chosen? How does the server avoid blocking 
forever on sending a reply? For idempotent services, this can be handled by having 
the client create a dedicated channel for the reply and having the server spawn a 
new thread to send the reply. The client side of this protocol is 

    fun clientCallEvt4 x = guard (fn () => let
          val replyCh = channel ()
          in
            spawn (fn () => send (reqCh, (replyCh, x)));
            receive replyCh
          end)

When the server sends the reply it evaluates 
spawn (fn () => send (replyCh, reply)) 

If the client has already chosen a different event, then this thread blocks and will be 
garbage collected. For services that are not idempotent, this scheme is not sufficient; 
the server needs a way to abort the transaction. The wrapAbort combinator provides 
this mechanism. 
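
The dedicated-reply-channel pattern carries over to other thread libraries. The following Python sketch is an analogue under different assumptions, not a translation: it uses buffered queue.Queue objects, so the server's reply never blocks and no extra reply thread is needed, and an abandoned reply queue is simply reclaimed once unreferenced. The names start_server and client_call are hypothetical.

```python
import queue
import threading


def start_server(doit):
    """Start a server thread applying doit to each request; return the request queue."""
    req_q = queue.Queue()

    def loop():
        while True:
            reply_q, x = req_q.get()   # each request carries its own reply queue
            reply_q.put(doit(x))       # never blocks: the queue has room for one reply

    threading.Thread(target=loop, daemon=True).start()
    return req_q


def client_call(req_q, x, timeout=None):
    """Send x with a fresh reply queue; return the reply, or None on timeout."""
    reply_q = queue.Queue(maxsize=1)
    req_q.put((reply_q, x))
    try:
        return reply_q.get(timeout=timeout)
    except queue.Empty:
        return None                    # client gives up; the late reply is discarded
```

As in the text, this is only adequate for idempotent services: a client that times out has no way to tell the server to undo the work it performed for the abandoned request.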


Abort actions. The wrapAbort combinator associates an abort action with an 
event value. The semantics are that if the event is not chosen in a sync opera- 
tion, then a new thread is spawned to evaluate the abort action. The type of this 
combinator is: 

    val wrapAbort : ('a event * (unit -> unit)) -> 'a event

where the second argument is the abort action. This combinator is the complement 
of wrap in the sense that if you view every base event in a choice as having both a 
wrapper and an abort action, then, when sync is applied, the wrapper of the chosen 
event is called and threads are spawned for each of the abort actions of the other 
base events. 

Using wrapAbort, we can now implement the RPC protocol for non-idempotent
services. The client code for the RPC using abort must allocate two channels: one
for the reply and one for the abort message:



    fun clientCallEvt5 x = guard (fn () => let
          val replyCh = channel ()
          val abortCh = channel ()
          fun abortFn () = send (abortCh, ())
          in
            spawn (fn () => send (reqCh, (replyCh, abortCh, x)));
            wrapAbort (receive replyCh, abortFn)
          end)

When the server is ready to reply (i.e., commit the transaction), it synchronizes on
the following event value: 

choose [ 

wrap (receive abortCh, fn () => abort the transaction) , 

wrap (transmit (replyCh, reply) , fn () => commit the transaction) 

] 

This mechanism can be used to implement the input-stream abstraction discussed at 
the beginning of this section, and in fact, the concurrent stream I/O library provided 
by CML is implemented in this way. 


3.4 Stream I/O 

CML provides a concurrent version of the SML stream I/O primitives. Input op- 
erations in this version are event valued, which allows them to be used in selective 
communication. For example, a program might want to give a user 60 seconds to 
supply a password. This can be programmed as: 

    fun getpasswd () = sync (choose [
            wrap (timeout (TIME{sec=60, usec=0}), fn () => NONE),
            wrap (input_line std_in, SOME)
          ])

This will return NONE if the user fails to respond within 60 seconds; otherwise it
wraps SOME around the user’s response. Streams are implemented as threads that
handle buffering. The input operations are actually request/reply/abort protocols,
similar to the one discussed above.


4 Applications 

CML is more than an exercise in language design; it is intended to be a useful 
tool for building large systems. I have implemented CML on top of SML/NJ. 
This implementation has been used by a number of people, including myself, for 
various different applications. This practical experience demonstrates the validity 
and usefulness of the design as well as the efficiency of the implementation. In this 
section, I describe two applications of CML, and how they use the features of CML. 



4.1 Interactive Systems 


Providing a better foundation for programming interactive systems, such as pro- 
gramming environments, was the original motivation for this line of research [RG86]. 
Because of their naturally concurrent structure, interactive systems are one of the 
most important application areas for CML. Concurrency arises in several ways in 
interactive systems: 

User interaction. Handling user input is the most complex aspect of an interactive 
program. Most interactive systems use an event-loop and call-back functions. 
The event-loop receives input events (e.g., mouse clicks) and passes them to 
the appropriate event-handler. This structure is a poor-man’s concurrency: the 
event-handlers are coroutines and the event-loop is the scheduler. 

Multiple services. For example, consider a document preparation system that 
provides both editing and formatting. These two services are independent and 
can be naturally organized as two separate threads. Multiple views are imple- 
mented by replicating the threads. 

Interleaving computation. A user of a document preparation system may want 
to edit one part of a document while another part is being formatted. Since 
formatting may take a significant amount of time, providing a responsive inter- 
face requires interleaving formatting and editing. If the editor and formatter are 
separate threads, then interleaving comes for free. 

Output-driven applications. Most windowing toolkits, for example Xlib [Nye90],
provide an input-oriented model, in which the application code is occasionally
called in response to some external event. But many applications are
output-oriented. Consider, for example, a computationally intensive simulation with a
graphical display of the current state of the simulation. This application must 
monitor window events, such as refresh and resize notifications, so that it can re- 
draw itself when necessary. In a sequential implementation, the handling of these 
events must be postponed until the simulation is ready to update the displayed 
information. By separating the display code and simulation code into separate 
threads, the handling of asynchronous redrawing is easy. 

The root cause of these forms of concurrency is computer-human interaction: humans 
are asynchronous and slow. 

CML has been used to build a multi-threaded interface to the X protocol [SG86], 
called eXene [GR91]. This system provides a substantially different, and we think 
better, model of user interaction than the traditional Xlib model [Nye90]. Windows 
in eXene have an environment, consisting of three streams of input from the win- 
dow’s parent (mouse, keyboard and control), and one output stream for requesting 
services from the window’s parent. For each child of the window, there will be
corresponding output streams and an input stream. The input streams are represented by
event values and the output streams by event-valued functions. A window is responsible
for routing messages to its children, but this can almost always be done using a
generic router function provided by eXene. Typically, each window has a separate
thread for each input stream as well as a thread, or two, for managing state and
coordinating the other threads. By breaking the code up this way, each individual
thread is quite simple. This model is similar to those of [Pik89] and [Haa90].

This structure allows us to use delegation techniques to define new behavior from 
existing implementations. Delegation is an object-oriented technique that originated 
in concurrent actor systems [Lie86]. As an example, consider the case of adding a 
menu to an existing text window. We can do this in a general way by defining a 
wrapper that takes a window’s environment and returns a new, wrapped, environ- 
ment. The wrapped environment has a thread monitoring the mouse stream of the 
old environment. Normally, this thread just passes messages along to the wrapped 
window, but when a mouse “button down” message comes along, the thread pops up 
the menu and consumes mouse messages until an item is chosen. Emden Gansner, of 
AT&T Bell Laboratories, has developed a “widget” toolkit on top of eXene, which 
uses these delegation techniques heavily. 

The implementation of eXene, which is currently about 8,500 lines of CML 
code, uses threads heavily. At the lowest level, threads are used to buffer communi- 
cation with the X server. There are threads to manage shared state, such as graphics 
contexts, fonts and keycode translation tables. Because the internal threads are fairly 
specialized and tightly integrated, there is not much use of events as an abstraction 
mechanism. 

The use of events as an abstraction mechanism is common at the application 
programmer’s level. In addition to the event-valued interface of the window environ- 
ments, there are higher-level objects that have abstract synchronous interfaces. One 
example is a virtual terminal window (vtty). This provides a synchronous stream 
interface to its clients, which is compatible with the signature of CML’s concurrent 
I/O library. If the client-code is implemented as a functor [Mac84] (parameterized 
module), then it can be used with either the concurrent I/O library or the vtty 
abstraction. 

The vtty abstraction is a good example of where user-defined abstract syn- 
chronous operations are necessary for program modularity. At any time, the vtty 
thread must be ready to receive input from the user and output from the applica- 
tion; thus it needs selective communication. The underlying window toolkit (eXene) 
provides an abstract interface to the input stream, but, since it is event-valued, it 
can be used in the selective communication. 

Another example of the use of new communication abstractions is a buffered 
multicast channel (a simple version is described in [Rep90b]). This abstraction has 
proven quite useful in supporting multiple views of an object. When the viewed 
object is updated, the thread managing its state sends a notification on the multicast 
channel. The multicast channel basically serves the role of a call-back function, while 
freeing the viewed object from the details of managing multiple views. All of the
details of creating/destroying views and distributing messages are taken care of by 
the multicast abstraction. 



4.2 Distributed systems programming 

Many distributed programming languages have concurrent languages at their core 
(e.g., SR [AOCE88]), and distributed programming toolkits often include thread 
packages (e.g., Isis [BCJ+90]). This is because threads provide a needed flexibility for 
dealing with the asynchronous nature of distributed systems. The flexibility provided 
by CML is a good base for distributed programming. Its support for low-level I/O is 
sufficient to build a structured synchronous interface to network communication (as 
was done in eXene). Higher-level linguistic support for distributed programming, 
such as the promise mechanism described in [LS88], can be built using events to 
define the new abstractions. 4 

Another example is Chet Murthy’s reimplementation of the Nuprl environment 
[Con86] using CML. His implementation is structured as a collection of “proof 
servers” running on different workstations. When an expensive operation on a proof 
tree is required, it can be decomposed and run in parallel on several different work- 
stations. This system uses CML to manage the interactions between the different 
workstations. 

Another project involving CML is a distributed programming toolkit for ML
that is being developed at Cornell University [Kru91, CK92, Kru92].
This work builds on the mechanisms prototyped in Murthy’s distributed Nuprl
and on the protocols developed for Isis [BCJ+90]. A new abstraction, called a port
group, has been developed to model distributed communication. The communication
operations provided by port groups are represented by event-value constructors (for
details see [Kru91]).


4.3 Other applications of CML 

CML has been used by various people for a number of other purposes. Andrew 
Appel has used it to teach concurrent programming to undergraduates at Princeton 
University (Appel, personal communication, January 1991). Gary Lindstrom and Lal
George have used it to experiment with functional control of imperative programs 
for parallel programming [GL91]. And Clement Pellerin has implemented a compiler 
from a concurrent constraint language to CML [Pel92]. 


5 Implementation 

CML is written entirely in SML, using a couple of non-standard extensions provided 
by SML/NJ: first-class continuations [DHM91] and asynchronous signals [Rep90a]. 
We added one minor primitive operation to the compiler (a ten-line change in a
30,000-line compiler), which was necessary to guarantee that tail-recursion is preserved by
sync. Threads are implemented using first-class continuations (a technique owed to
Wand [Wan80]), and the SML/NJ asynchronous signal facility is used to implement
preemptive scheduling.

4 See [GR91] or [Rep92] for a description of the implementation of promises in CML.

Unlike other continuation-passing style compilers, such as Rabbit [Ste78] and 
Orbit [KKR+86], the code generated by the SML/NJ compiler does not use a 
run-time stack [App92]. This means that callcc and throw are constant-time
operations. While this is possible using a stack [HDB90], heap-based implementations
are better suited for implementing light-weight threads (Haahr’s experience bears
this out [Haa90]).

Event values have a natural implementation in terms of first-class continuations. 
Without the choose operator, an event value could be represented as 

type 'a event = 'a cont -> 'a

with sync being directly implemented by callcc. This representation captures the 
intuition that an event is just a synchronous operation with its synchronization point 
continuation as a free variable. The choose operator requires polling, since we need 
to see which (if any) base events are immediately available for synchronization. Thus, 
the implementation of an event value is a list of base events, with each base event 
represented by a polling function, a function to call for immediate synchronization 
and a function for blocking. The implementation of CML is described in detail, 
including detailed performance measurements, in the author’s dissertation [Rep92]. 
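
As a rough illustration of the choose-free representation above, the following sketch implements sync directly with callcc. It assumes the SMLofNJ.Cont structure of SML/NJ; the names always and wrap are illustrative stand-ins chosen for this example and are not CML's actual implementation, which uses lists of base events as just described.

```sml
(* Sketch only: events without choose, represented as functions over their
   synchronization-point continuation. Requires SML/NJ (SMLofNJ.Cont). *)
structure K = SMLofNJ.Cont

type 'a event = 'a K.cont -> 'a

(* Synchronizing captures the current continuation and hands it to the event. *)
fun sync (ev : 'a event) : 'a = K.callcc ev

(* Hypothetical helper: an event that is always ready with the value x. *)
fun always (x : 'a) : 'a event = fn k => K.throw k x

(* Hypothetical helper: post-process the result of ev with f. *)
fun wrap (ev : 'a event, f : 'a -> 'b) : 'b event =
  fn k => K.throw k (f (sync ev))

val result = sync (wrap (always 20, fn n => n + 1))
```

Here the continuation passed to each event plays the role of the free "synchronization point" variable mentioned in the text.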


6 Dynamic Semantics 

In this section, I present the dynamic semantics of $\lambda_{cv}$, a concurrent extension of
Plotkin’s $\lambda_v$ calculus [Plo75]. Although $\lambda_{cv}$ lacks many of the features of CML,
it contains the core of the concurrency primitives, including first-class synchronous
operations, and the various event-value combinators. A discussion of how $\lambda_{cv}$ can be
extended to include many of the missing features of CML can be found in [Rep92].

The dynamic semantics of $\lambda_{cv}$ is defined by two evaluation relations: a sequential
evaluation relation “$\mapsto$” and a concurrent evaluation relation “$\Rightarrow$”, where
concurrent evaluation is an extension of sequential evaluation to finite sets of processes.


6.1 The Syntax of $\lambda_{cv}$

We start with disjoint sets of variables, function constants, base constants and
channel names, which comprise the ground terms of $\lambda_{cv}$:

$x \in \mathrm{Var}$   variables
$b \in \mathrm{Const} = \mathrm{BConst} \cup \mathrm{FConst}$   constants
$\mathrm{BConst} = \{(), \mathtt{true}, \mathtt{false}, 0, 1, \ldots\}$   base constants
$\mathrm{FConst} = \{+, -, \mathtt{fst}, \mathtt{snd}, \ldots\}$   function constants
$\kappa \in \mathrm{Ch}$   channel names



The set FConst includes the following event-valued combinators and constructors: 

choose, guard, never, receive, transmit, wrap, wrapAbort 

There are three syntactic classes of terms in $\lambda_{cv}$:

$e \in \mathrm{Exp}$   expressions
$v \in \mathrm{Val} \subset \mathrm{Exp}$   values
$ev \in \mathrm{Event} \subset \mathrm{Val}$   event values

where values are the irreducible terms in the dynamic semantics. The terms of $\lambda_{cv}$ are
defined by the grammar in Figure 5.

$e$  ::=  $v$   value
      |  $e_1\ e_2$   application
      |  $(e_1 . e_2)$   pair
      |  let $x = e_1$ in $e_2$   let
      |  chan $x$ in $e$   channel creation
      |  spawn $e$   process creation
      |  sync $e$   synchronization

$v$  ::=  $b$   constant
      |  $x$   variable
      |  $(v_1 . v_2)$   pair value
      |  $\lambda x (e)$   $\lambda$-abstraction
      |  $\kappa$   channel name
      |  $ev$   event value
      |  $(\mathbf{G}\, e)$   guarded event function

$ev$  ::=  $\Lambda$   never
       |  $\kappa ! v$   channel output
       |  $\kappa ?$   channel input
       |  $(ev \Rightarrow v)$   wrapper
       |  $(ev_1 \oplus ev_2)$   choice
       |  $(ev \mid v)$   abort wrapper

Fig. 5. Grammar for $\lambda_{cv}$

Pairs have been included to make the handling
of two-argument functions easier. Note that the syntactic class of the term $(v_1 . v_2)$
is either Exp or Val; this ambiguity is resolved in favor of Val. The value $\Lambda$ is a
base event value that is never matched (equivalent to choose [] in CML). There are
three binding forms in this term language: let binding, $\lambda$-abstraction and channel
creation. Unlike CML, new channels are introduced by the special binding form for
channel creation. This is done to simplify the presentation of the next chapter, and
the channel function of CML can be easily defined in terms of $\lambda_{cv}$. The set $\mathrm{Val}^{\circ}$
is the set of closed value terms (i.e., those without free variables); note, however,
that closed values may contain free channel names. The free channel names of an
expression $e$ are denoted by $\mathrm{FCN}(e)$. Note that, since there are no channel name
binding forms, $\mathrm{FCN}(e)$ is exactly the set of channel names that appear in $e$. There
is no term for sequencing, but we write “$(e_1; e_2)$” for “snd $(e_1 . e_2)$,” which, since
the language is call-by-value, has the desired semantics.

Channel names and event values are not part of the concrete syntax of the
language; rather, they appear as the intermediate results of evaluation. A program is a
closed term, which does not contain any guarded event functions (i.e., $(\mathbf{G}\, e)$ terms),
or any subterms in the syntactic classes Event or Ch. In other words, programs do
not contain intermediate values.


6.2 Sequential Evaluation 

There are a number of different ways to specify the dynamic semantics of program- 
ming languages. I use the style of operational semantics developed by Felleisen and 
Friedman [FF86], because it provides a good framework for proving type sound- 
ness results [WF91]. In this approach, the objects of the dynamic semantics are the 
syntactic terms in Exp. 

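The meaning of the function constants is given by the partial function

$$\delta : \mathrm{FConst} \times \mathrm{Val}^{\circ} \rightarrow \mathrm{Val}^{\circ}$$

Since a closed value $v \in \mathrm{Val}^{\circ}$ can have free channel names in it, we require that if
$b \in \mathrm{FConst}$ and $\delta(b, v)$ is defined, then

$$\mathrm{FCN}(\delta(b, v)) \subseteq \mathrm{FCN}(v)$$

In other words, $\delta$ is not allowed to introduce new channel names. For the standard
built-in function constants, the meaning of $\delta$ is the expected one. For example:

$\delta(+, (0 . 1)) = 1$
$\delta(+, (1 . 1)) = 2$
$\delta(\mathtt{fst}, (v_1 . v_2)) = v_1$
$\delta(\mathtt{snd}, (v_1 . v_2)) = v_2$

The meaning of $\delta$ is straightforward for most of the event-valued combinators and
constructors:

$\delta(\mathtt{never}, ()) = \Lambda$
$\delta(\mathtt{transmit}, (\kappa . v)) = \kappa ! v$
$\delta(\mathtt{receive}, \kappa) = \kappa ?$
$\delta(\mathtt{wrap}, (ev . v)) = (ev \Rightarrow v)$
$\delta(\mathtt{choose}, (ev_1 . ev_2)) = (ev_1 \oplus ev_2)$
$\delta(\mathtt{wrapAbort}, (ev . v)) = (ev \mid v)$

The only complication arises in the case of guarded-event values:

$\delta(\mathtt{guard}, v) = (\mathbf{G}\, (v\ ()))$
$\delta(\mathtt{wrap}, ((\mathbf{G}\, e) . v)) = (\mathbf{G}\, (\mathtt{wrap}\ (e . v)))$
$\delta(\mathtt{choose}, ((\mathbf{G}\, e_1) . (\mathbf{G}\, e_2))) = (\mathbf{G}\, (\mathtt{choose}\ (e_1 . e_2)))$
$\delta(\mathtt{choose}, ((\mathbf{G}\, e_1) . ev_2)) = (\mathbf{G}\, (\mathtt{choose}\ (e_1 . ev_2)))$
$\delta(\mathtt{choose}, (ev_1 . (\mathbf{G}\, e_2))) = (\mathbf{G}\, (\mathtt{choose}\ (ev_1 . e_2)))$
$\delta(\mathtt{wrapAbort}, ((\mathbf{G}\, e) . v)) = (\mathbf{G}\, (\mathtt{wrapAbort}\ (e . v)))$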



These rules reflect guard’s role as a delay operator; when another event constructor 
is applied to a guarded event value, then the guard operator (G) is pulled out to 
delay the event construction. 
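
The way the guard operator is pulled outward can be sketched as a small Standard ML function over a term datatype. This is a minimal model written for illustration, assuming a simplified term representation of my own (the constructor names are hypothetical); it covers only the event-valued constants and does not check that arguments really are event values.

```sml
(* Simplified term language; constructor names are illustrative assumptions. *)
datatype term
  = Unit
  | Var of string
  | Const of string                (* function constants such as wrap *)
  | Ch of string                   (* channel name *)
  | PairT of term * term           (* (e1 . e2) *)
  | App of term * term             (* application *)
  | Never                          (* never-matched base event *)
  | Out of string * term           (* channel output k!v *)
  | In of string                   (* channel input k? *)
  | Wrap of term * term            (* wrapper *)
  | Choice of term * term          (* choice *)
  | Abort of term * term           (* abort wrapper *)
  | G of term                      (* guarded event function *)

(* delta for the event-valued constants; NONE means "undefined here". *)
fun delta ("never", Unit) = SOME Never
  | delta ("transmit", PairT (Ch k, v)) = SOME (Out (k, v))
  | delta ("receive", Ch k) = SOME (In k)
  | delta ("guard", v) = SOME (G (App (v, Unit)))
  | delta ("wrap", PairT (G e, v)) =
      SOME (G (App (Const "wrap", PairT (e, v))))
  | delta ("wrap", PairT (ev, v)) = SOME (Wrap (ev, v))
  | delta ("choose", PairT (G e1, G e2)) =
      SOME (G (App (Const "choose", PairT (e1, e2))))
  | delta ("choose", PairT (G e1, ev2)) =
      SOME (G (App (Const "choose", PairT (e1, ev2))))
  | delta ("choose", PairT (ev1, G e2)) =
      SOME (G (App (Const "choose", PairT (ev1, e2))))
  | delta ("choose", PairT (ev1, ev2)) = SOME (Choice (ev1, ev2))
  | delta ("wrapAbort", PairT (G e, v)) =
      SOME (G (App (Const "wrapAbort", PairT (e, v))))
  | delta ("wrapAbort", PairT (ev, v)) = SOME (Abort (ev, v))
  | delta _ = NONE
```

The guarded cases are listed before the general ones so that a G argument always takes precedence, which mirrors the delaying behavior described above.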

An evaluation context is a single-hole context where the hole marks the next redex
(or is at the top if the term is irreducible). The evaluation of $\lambda_{cv}$ is call-by-value and
left-to-right, which leads to the following grammar for the evaluation contexts of
$\lambda_{cv}$.

$E$ ::= $[\,]$ | $E\ e$ | $v\ E$ | $(E . e)$ | $(v . E)$
     | let $x = E$ in $e$ | spawn $E$ | sync $E$

The sequential evaluation relation is defined in terms of these contexts: 


Definition 1 ($\mapsto$). The sequential evaluation relation, written “$\mapsto$”, is the
smallest relation satisfying the following four rules:

$E[b\ v] \mapsto E[\delta(b, v)]$   ($\lambda_{cv}$-$\delta$)
$E[\lambda x (e)\ v] \mapsto E[e[x \mapsto v]]$   ($\lambda_{cv}$-$\beta$)
$E[\text{let } x = v \text{ in } e] \mapsto E[e[x \mapsto v]]$   ($\lambda_{cv}$-let)
$E[\text{sync } (\mathbf{G}\, e)] \mapsto E[\text{sync } e]$   ($\lambda_{cv}$-guard)

Note that the rule ($\lambda_{cv}$-guard) forces the expression delayed by guard. As usual,
$\mapsto^{*}$ is the transitive closure of $\mapsto$. The evaluation of the other new forms (e.g.,
spawn) is defined as part of the concurrent evaluation relation in Section 6.4.


6.3 Event Matching 

The key concept in the semantics of concurrent evaluation is the notion of event 
matching, which captures the semantics of rendezvous and communication. Infor- 
mally, if two processes synchronize on matching events, then they can exchange 
values and continue evaluation. Before we can make this more formal, we need an 
auxiliary definition.

Definition 2. The abort action of an event value ev is an expression, which, when 
evaluated, spawns the abort wrappers of ev. The map 

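$$\mathrm{AbortAct} : \mathrm{Event} \rightarrow \mathrm{Exp}$$

maps an event value to its abort action, and is defined inductively as follows:

$\mathrm{AbortAct}(\Lambda) = ()$
$\mathrm{AbortAct}(\kappa ?) = ()$
$\mathrm{AbortAct}(\kappa ! v) = ()$
$\mathrm{AbortAct}(ev \Rightarrow e) = \mathrm{AbortAct}(ev)$
$\mathrm{AbortAct}(ev_1 \oplus ev_2) = (\mathrm{AbortAct}(ev_1); \mathrm{AbortAct}(ev_2))$
$\mathrm{AbortAct}(ev \mid v) = (\mathrm{AbortAct}(ev); \text{spawn } v)$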
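
The inductive definition of the abort action translates almost line for line into a recursive function. The sketch below assumes a simplified representation of my own: Name is a hypothetical stand-in for an arbitrary closed value, and Seq stands for the sequencing abbreviation (e1; e2).

```sml
(* Expressions needed to state abort actions; names are illustrative. *)
datatype exp
  = Unit
  | Name of string               (* stand-in for an arbitrary closed value *)
  | Seq of exp * exp             (* (e1; e2) *)
  | Spawn of exp                 (* spawn e *)

datatype event
  = Never                        (* never-matched base event *)
  | In of string                 (* channel input k? *)
  | Out of string * exp          (* channel output k!v *)
  | Wrap of event * exp          (* wrapper *)
  | Choice of event * event      (* choice *)
  | Abort of event * exp         (* abort wrapper *)

(* One clause per equation of the definition. *)
fun abortAct Never = Unit
  | abortAct (In _) = Unit
  | abortAct (Out _) = Unit
  | abortAct (Wrap (ev, _)) = abortAct ev
  | abortAct (Choice (ev1, ev2)) = Seq (abortAct ev1, abortAct ev2)
  | abortAct (Abort (ev, v)) = Seq (abortAct ev, Spawn v)

(* An input with an abort wrapper, offered alongside an output. *)
val example =
  abortAct (Choice (Abort (In "k", Name "cleanup"), Out ("k", Unit)))
```

For this example the computed abort action spawns the hypothetical cleanup value exactly once, and ordinary wrappers contribute nothing.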


With this definition we can define the matching of event values formally: 



Definition 3 (Event matching). The matching of event values is defined as a
family of binary symmetric relations (indexed by Ch). For $\kappa \in \mathrm{Ch}$, define

$$ev_1 \stackrel{\kappa}{\smile} ev_2 \text{ with } (e_1, e_2)$$

(pronounced “$ev_1$ matches $ev_2$ on channel $\kappa$ with respective results $e_1$ and $e_2$”) as the
smallest relation satisfying the six inference rules given in Figure 6. This relation is
abbreviated to $ev_1 \stackrel{\kappa}{\smile} ev_2$ when the results are unimportant.


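$$\kappa ! v \stackrel{\kappa}{\smile} \kappa ? \text{ with } ((), v)$$

$$\frac{ev_1 \stackrel{\kappa}{\smile} ev_2 \text{ with } (e_1, e_2)}{ev_2 \stackrel{\kappa}{\smile} ev_1 \text{ with } (e_2, e_1)}$$

$$\frac{ev_1 \stackrel{\kappa}{\smile} ev_2 \text{ with } (e_1, e_2)}{ev_1 \stackrel{\kappa}{\smile} (ev_2 \Rightarrow v) \text{ with } (e_1, v\ e_2)}$$

$$\frac{ev_1 \stackrel{\kappa}{\smile} ev_2 \text{ with } (e_1, e_2)}{ev_1 \stackrel{\kappa}{\smile} (ev_2 \oplus ev_3) \text{ with } (e_1, (\mathrm{AbortAct}(ev_3); e_2))}$$

$$\frac{ev_1 \stackrel{\kappa}{\smile} ev_2 \text{ with } (e_1, e_2)}{ev_1 \stackrel{\kappa}{\smile} (ev_3 \oplus ev_2) \text{ with } (e_1, (\mathrm{AbortAct}(ev_3); e_2))}$$

$$\frac{ev_1 \stackrel{\kappa}{\smile} ev_2 \text{ with } (e_1, e_2)}{ev_1 \stackrel{\kappa}{\smile} (ev_2 \mid v) \text{ with } (e_1, e_2)}$$

Fig. 6. Rules for event matching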

An example of event matching is: 

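$$(\kappa ? \Rightarrow \lambda x ((x . x))) \stackrel{\kappa}{\smile} (\kappa ! 17 \oplus (\kappa ? \Rightarrow \lambda x (()))) \text{ with } (\lambda x ((x . x))\ 17, ())$$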

Informally, if two processes attempt to synchronize on matching event values, then 
we can replace the applications of sync with the respective results. This is made 
more precise in the next section where the concurrent evaluation relation is defined. 

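Note that event matching is nondeterministic; for example, both

$$\kappa ? \stackrel{\kappa}{\smile} (\kappa ! 17 \oplus \kappa ! 29) \text{ with } (17, ())$$

and

$$\kappa ? \stackrel{\kappa}{\smile} (\kappa ! 17 \oplus \kappa ! 29) \text{ with } (29, ())$$

hold.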

It is also worth noting that even if one of the wrappers of an event value is non- 
terminating, the necessary abort actions for that event will be executed (assuming 
fair evaluation). This property is important because a common CML idiom is to have 
tail-recursive calls in wrappers (e.g., the buffered channel abstraction in Figure 3). 
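
The matching rules of Figure 6 can be explored with a small Standard ML function that enumerates every match of two event values on a given channel. This is a sketch under my own assumptions: the datatypes and the helper alts are hypothetical, and instead of applying the symmetry rule directly, each event is first flattened into the base communications it offers. The results are built literally from the rules, so a choice contributes the abort action of the branch not taken, even when that action is just the unit value.

```sml
datatype exp
  = Unit | Int of int | Name of string
  | App of exp * exp | Seq of exp * exp | Spawn of exp

datatype event
  = Never | In of string | Out of string * exp
  | Wrap of event * exp | Choice of event * event | Abort of event * exp

fun abortAct Never = Unit
  | abortAct (In _) = Unit
  | abortAct (Out _) = Unit
  | abortAct (Wrap (ev, _)) = abortAct ev
  | abortAct (Choice (a, b)) = Seq (abortAct a, abortAct b)
  | abortAct (Abort (ev, v)) = Seq (abortAct ev, Spawn v)

datatype base = Send of string * exp | Recv of string

(* alts ev: each base communication offered by ev, paired with a function
   that builds the result expression from the base result. *)
fun alts Never = []
  | alts (In k) = [(Recv k, fn e => e)]
  | alts (Out (k, v)) = [(Send (k, v), fn e => e)]
  | alts (Wrap (ev, v)) =
      map (fn (c, f) => (c, fn e => App (v, f e))) (alts ev)
  | alts (Choice (a, b)) =
      map (fn (c, f) => (c, fn e => Seq (abortAct b, f e))) (alts a)
      @ map (fn (c, f) => (c, fn e => Seq (abortAct a, f e))) (alts b)
  | alts (Abort (ev, _)) = alts ev

(* matches k (ev1, ev2): all result pairs (e1, e2) for matches on channel k. *)
fun matches k (ev1, ev2) =
  List.concat (map (fn (b1, f1) =>
    List.mapPartial (fn (b2, f2) =>
      case (b1, b2) of
        (Send (k1, v), Recv k2) =>
          if k1 = k andalso k2 = k then SOME (f1 Unit, f2 v) else NONE
      | (Recv k1, Send (k2, v)) =>
          if k1 = k andalso k2 = k then SOME (f1 v, f2 Unit) else NONE
      | _ => NONE) (alts ev2)) (alts ev1))

(* The nondeterministic example: an input against a choice of two outputs. *)
val both = matches "k" (In "k", Choice (Out ("k", Int 17), Out ("k", Int 29)))
```

Both alternatives appear in the resulting list, which is one way to see the nondeterminism noted above; the second component of each pair is the sequence of a trivial abort action and the unit value, which evaluates to the unit result shown in the text.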


6.4 Concurrent Evaluation 

Concurrent evaluation is defined as a transition system between finite sets of pro- 
cess states. This is similar to the style of the “Chemical Abstract Machine” [BB90], 
except that there are no “cooling” and “heating” transitions (the process sets of 
this semantics can be thought of as perpetually “hot” solutions). The concurrent 
evaluation relation extends “i — >” to finite sets of terms (i.e. , processes) and adds 
additional rules for process creation, channel creation, and communication. We as- 
sume a set of process identifiers, and define the set of processes and process sets 
as: 

$\pi \in \mathrm{ProcId}$   process IDs
$p = \langle \pi; e \rangle \in \mathrm{Proc} = (\mathrm{ProcId} \times \mathrm{Exp})$   processes
$\mathcal{P} \in \mathrm{Fin}(\mathrm{Proc})$   process sets

We often write a process as $\langle \pi; E[e] \rangle$, where the evaluation context serves the role
of the program counter, marking the current state of evaluation.

Definition 4. A process set $\mathcal{P}$ is well-formed if for all $\langle \pi; e \rangle \in \mathcal{P}$ the following hold:

- $\mathrm{FV}(e) = \emptyset$ ($e$ is closed), and

- there is no $e' \neq e$, such that $\langle \pi; e' \rangle \in \mathcal{P}$.

It is occasionally useful to view well-formed process sets as finite maps from ProcId
to Exp. If $\mathcal{P}$ is a finite set of process states and $\mathcal{K}$ is a finite set of channel names,
then $\mathcal{K},\mathcal{P}$ is a configuration.

Definition 5. A configuration $\mathcal{K},\mathcal{P}$ is well-formed, if $\mathrm{FCN}(\mathcal{P}) \subseteq \mathcal{K}$ and $\mathcal{P}$ is
well-formed.

The concurrent evaluation relation “$\Rightarrow$” extends “$\mapsto$” to configurations, with
additional rules for the concurrency operations. It is defined by four inference rules
that define single step evaluations. Each concurrent evaluation step affects one or
two processes, called the selected processes. I first describe each of these rules
independently, and then state the formal definition. In stating these rules, we use the
notation $S + x$ for $S \cup \{x\}$.

The first rule extends the sequential evaluation relation ($\mapsto$) to configurations:

$$\frac{e \mapsto e'}{\mathcal{K},\mathcal{P} + \langle \pi; e \rangle \Rightarrow \mathcal{K},\mathcal{P} + \langle \pi; e' \rangle} \quad (\lambda_{cv}\text{-}{\mapsto})$$

The selected process is $\pi$.


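The creation of channels requires picking a new channel name and substituting
for the variable bound to it:

$$\frac{\kappa \notin \mathcal{K}}{\mathcal{K},\mathcal{P} + \langle \pi; E[\text{chan } x \text{ in } e] \rangle \Rightarrow \mathcal{K} + \kappa, \mathcal{P} + \langle \pi; E[e[x \mapsto \kappa]] \rangle} \quad (\lambda_{cv}\text{-chan})$$

Again, $\pi$ is the selected process.

Process creation requires picking a new process identifier:

$$\frac{\pi' \notin \mathrm{dom}(\mathcal{P}) + \pi}{\mathcal{K},\mathcal{P} + \langle \pi; E[\text{spawn } v] \rangle \Rightarrow \mathcal{K},\mathcal{P} + \langle \pi; E[()] \rangle + \langle \pi'; v\ () \rangle} \quad (\lambda_{cv}\text{-spawn})$$

This rule has two selected processes: $\pi$ and $\pi'$.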

The most interesting rule describes communication and synchronization. If two
processes are attempting synchronization on matching events, then they may
rendezvous, i.e., exchange a message and continue evaluation:

$$\frac{ev_1 \stackrel{\kappa}{\smile} ev_2 \text{ with } (e_1, e_2)}{\mathcal{K},\mathcal{P} + \langle \pi_1; E_1[\text{sync } ev_1] \rangle + \langle \pi_2; E_2[\text{sync } ev_2] \rangle \Rightarrow \mathcal{K},\mathcal{P} + \langle \pi_1; E_1[e_1] \rangle + \langle \pi_2; E_2[e_2] \rangle} \quad (\lambda_{cv}\text{-sync})$$

The selected processes for this rule are $\pi_1$ and $\pi_2$. We say that $\kappa$ is used in this
transition.


More formally, concurrent evaluation is defined as follows: 


Definition 6 ($\Rightarrow$). The concurrent evaluation relation, written “$\Rightarrow$”, is the
smallest relation satisfying the rules: ($\lambda_{cv}$-$\mapsto$), ($\lambda_{cv}$-chan), ($\lambda_{cv}$-spawn), and ($\lambda_{cv}$-sync).

Under these rules, processes live forever; i.e., if a process evaluates to a value,
it will never again be selected, but it remains in the process set. We could add the
following rule, which is similar to the evaporation rule of [BB90]:

$$\mathcal{K},\mathcal{P} + \langle \pi; [v] \rangle \Rightarrow \mathcal{K},\mathcal{P}$$

This rule is not included because certain results are easier to state and prove if the
process set is monotonically increasing.

The following evaluation illustrates the concurrent evaluation relation: 

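$\{\}, \{\langle \pi_0; [\text{chan } k \text{ in } (\text{spawn } \lambda x (\text{sync } (\text{transmit } (k . 5))); \text{sync } (\text{receive } k))] \rangle\}$
$\Rightarrow \{\kappa_0\}, \{\langle \pi_0; ([\text{spawn } \lambda x (\text{sync } (\text{transmit } (\kappa_0 . 5)))]; \text{sync } (\text{receive } \kappa_0)) \rangle\}$
$\Rightarrow \{\kappa_0\}, \{\langle \pi_0; ((); \text{sync } ([\text{receive } \kappa_0])) \rangle, \langle \pi_1; [\lambda x (\text{sync } (\text{transmit } (\kappa_0 . 5)))\ ()] \rangle\}$
$\Rightarrow \{\kappa_0\}, \{\langle \pi_0; ((); \text{sync } ([\text{receive } \kappa_0])) \rangle, \langle \pi_1; \text{sync } ([\text{transmit } (\kappa_0 . 5)]) \rangle\}$
$\Rightarrow^{*} \{\kappa_0\}, \{\langle \pi_0; ((); [\text{sync } (\kappa_0 ?)]) \rangle, \langle \pi_1; [\text{sync } (\kappa_0 ! 5)] \rangle\}$
$\Rightarrow \{\kappa_0\}, \{\langle \pi_0; [((); 5)] \rangle, \langle \pi_1; [()] \rangle\}$
$\Rightarrow \{\kappa_0\}, \{\langle \pi_0; [5] \rangle, \langle \pi_1; [()] \rangle\}$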



Note that this is only one of several possible evaluation sequences, although, in this 
example, all evaluation sequences produce the same result. 


6.5 Traces 

Because of the non-deterministic semantics of $\Rightarrow$, a $\lambda_{cv}$ program can have many
(often infinitely many) different evaluations. Furthermore, there are many
interesting programs that do not terminate. Thus some new terminology and notation for
describing evaluation sequences is required. This is used to state type soundness
results for $\lambda_{cv}$ in the next section.

First we note the following properties of $\Rightarrow$:

Lemma 7. If $\mathcal{K},\mathcal{P}$ is well-formed and $\mathcal{K},\mathcal{P} \Rightarrow \mathcal{K}',\mathcal{P}'$ then the following hold:

1. $\mathcal{K}',\mathcal{P}'$ is well-formed

2. $\mathcal{K} \subseteq \mathcal{K}'$

3. $\mathrm{dom}(\mathcal{P}) \subseteq \mathrm{dom}(\mathcal{P}')$

Proof. By examination of the rules for $\Rightarrow$.

Corollary 8. The properties of Lemma 7 hold for $\Rightarrow^{*}$.

Proof. By induction on the length of the evaluation sequence.

Note that property (1) implies that evaluation preserves closed terms. 

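Definition 9. A trace $T$ is a (possibly infinite) sequence of well-formed configurations

$$T = \langle\langle \mathcal{K}_0,\mathcal{P}_0;\ \mathcal{K}_1,\mathcal{P}_1;\ \ldots \rangle\rangle$$

such that $\mathcal{K}_i,\mathcal{P}_i \Rightarrow \mathcal{K}_{i+1},\mathcal{P}_{i+1}$ (for $i < n$, if $T$ is finite with length $n$). The head of
$T$ is $\mathcal{K}_0,\mathcal{P}_0$.

Note that if a configuration $\mathcal{K}_0,\mathcal{P}_0$ is well-formed, then any sequence of evaluation
steps starting with $\mathcal{K}_0,\mathcal{P}_0$ is a trace (by Corollary 8).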

The possible states of a process with respect to a configuration are given by the 
following definition. 

Definition 10. Let $\mathcal{P}$ be a well-formed process set and let $p \in \mathcal{P}$, with $p = \langle \pi; e \rangle$.
The state of $\pi$ in $\mathcal{P}$ is either zombie, blocked, or ready, depending on the form of $e$:

- if $e = [v]$, then $p$ is a zombie,

- if $e = E[\text{sync } ev]$ and there does not exist a $\langle \pi'; E'[\text{sync } ev'] \rangle \in (\mathcal{P} \setminus \{p\})$, such
  that $ev \stackrel{\kappa}{\smile} ev'$, then $\pi$ is blocked in $\mathcal{P}$.

- otherwise, $\pi$ is ready in $\mathcal{P}$.

We define the set of ready processes in $\mathcal{P}$ by

$$\mathrm{Rdy}(\mathcal{P}) = \{\pi \mid \pi \text{ is ready in } \mathcal{P}\}$$

A configuration $\mathcal{K},\mathcal{P}$ is terminal if $\mathrm{Rdy}(\mathcal{P}) = \emptyset$. A terminal configuration with
blocked processes is said to be deadlocked.

Definition 11. A trace is a computation if it is maximal; i.e., if it is infinite or if it
is finite and ends in a terminal configuration. If $e$ is a program, then we define the
computations of $e$ to be

$$\mathrm{Comp}(e) = \{T \mid T \text{ is a computation with head } \langle \pi_0; e \rangle\}$$

Note, I follow the convention of using $\pi_0$ as the process identifier of the initial process
in a computation of a program.

Definition 12. The set of processes of a trace $T$ is defined as

$$\mathrm{Procs}(T) = \{\pi \mid \exists\, \mathcal{K}_i,\mathcal{P}_i \in T \text{ with } \pi \in \mathrm{dom}(\mathcal{P}_i)\}$$

Since a given program can evaluate in different ways, the sequential notions 
of convergence and divergence are inadequate. Instead, we define convergence and 
divergence relative to a particular computation of a program. 

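Definition 13. A process $\pi \in \mathrm{Procs}(T)$ converges to a value $v$ in $T$, written $\pi \Downarrow_T v$,
if $\mathcal{K},\mathcal{P} + \langle \pi; v \rangle \in T$. We say that $\pi$ diverges in $T$, written $\pi \Uparrow_T$, if for every $\mathcal{K},\mathcal{P} \in T$,
with $\pi \in \mathrm{dom}(\mathcal{P})$, $\pi$ is ready or blocked in $\mathcal{P}$.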

Divergence includes deadlocked processes and terminating processes that are not 
evaluated often enough to reach termination, as well as those with infinite loops. 
It does not include processes with run-time type errors, which are called stuck (see 
Section 7.3). 


7 Static Semantics 

It is well known that references (i.e., updatable memory cells) can be coded using 
channels and processes [GMP89, Rep91b, BMT92]. This fact makes it apparent that 
polymorphic channels incur the same typing problems as polymorphic references 
(see [Tof90] for a good description of these problems). It is possible, however, to 
give a sound typing to $\lambda_{cv}$ programs using the imperative type scheme of SML
[MTH90, Tof90]. In this section, I present a type system for $\lambda_{cv}$ programs, and state
soundness results about the system. The sound typing of $\lambda_{cv}$ is discussed in greater
detail (including a proof of the soundness of the type system) in [Rep92].



The presentation uses standard notation (e.g., see [Tof90] or [WF91]). Let $\iota \in$
TyCon = {int, bool, ...} designate the type constants. Type variables are
partitioned into two sets:

$u \in \mathrm{ImpTyVar}$   imperative type variables
$t \in \mathrm{AppTyVar}$   applicative type variables
$\alpha, \beta \in \mathrm{TyVar} = \mathrm{ImpTyVar} \cup \mathrm{AppTyVar}$   type variables

The set of types, $\tau \in \mathrm{Ty}$, is defined by

$\tau$ ::= $\iota$   type constants
      | $\alpha$   type variables
      | $(\tau_1 \rightarrow \tau_2)$   function types
      | $(\tau_1 \times \tau_2)$   pair types
      | $\tau$ chan   channel types
      | $\tau$ event   event types

and the set of type schemes, $\sigma \in \mathrm{TyScheme}$, is defined by

$\sigma$ ::= $\tau$
      | $\forall \alpha . \sigma$

We write $\forall \alpha_1 \cdots \alpha_n . \tau$ for the type scheme $\sigma = \forall \alpha_1 \cdots \forall \alpha_n . \tau$, and write $\mathrm{FTV}(\sigma)$ for
the free type variables of $\sigma$. We define the set of imperative types by

$$\theta \in \mathrm{ImpTy} = \{\tau \mid \mathrm{FTV}(\tau) \subseteq \mathrm{ImpTyVar}\}$$

Note that all of the free type variables in an imperative type are imperative.

Type environments assign type schemes to variables in terms. Since we are in- 
terested in assigning types to intermediate stages of evaluation, channel names also 
need to be assigned types. Therefore, a typing environment is a pair of finite maps: 
a variable typing and a channel typing: 

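$VT \in \mathrm{VarTy} = \mathrm{Var} \xrightarrow{\text{fin}} \mathrm{TyScheme}$
$CT \in \mathrm{ChanTy} = \mathrm{Ch} \xrightarrow{\text{fin}} \mathrm{ImpTy}$
$TE = (VT, CT) \in \mathrm{TyEnv} = (\mathrm{VarTy} \times \mathrm{ChanTy})$

We use $\mathrm{FTV}(VT)$ and $\mathrm{FTV}(CT)$ to denote the sets of free type variables of variable
and channel typings, and

$$\mathrm{FTV}(TE) = \mathrm{FTV}(VT) \cup \mathrm{FTV}(CT)$$

where $TE = (VT, CT)$. Note that there are no bound type variables in a channel
typing, and that $\mathrm{FTV}(CT) \subseteq \mathrm{ImpTyVar}$. The following shorthand is useful for type
environment modification:

$TE \pm \{x \mapsto \sigma\} =_{def} (VT \pm \{x \mapsto \sigma\}, CT)$
$TE \pm \{\kappa \mapsto \theta\} =_{def} (VT, CT \pm \{\kappa \mapsto \theta\})$

where $x \in \mathrm{Var}$, $\kappa \in \mathrm{Ch}$, and $TE = (VT, CT)$.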

Because of the need to preserve imperative types, we require that substitutions 
map imperative type variables to imperative types. As before, we allow substitutions 
to be applied to types and type environments. 



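Definition 14. A type $\tau'$ is an instance of a type scheme $\sigma = \forall \alpha_1 \cdots \alpha_n . \tau$, written
$\sigma \succ \tau'$, if there exists a finite substitution, $S$, with $\mathrm{dom}(S) = \{\alpha_1, \ldots, \alpha_n\}$ and
$S\tau = \tau'$. If $\sigma \succ \tau'$, then we say that $\sigma$ is a generalization of $\tau'$. We say that $\sigma \succ \sigma'$
if whenever $\sigma' \succ \tau$, then $\sigma \succ \tau$.

Definition 15. The closure of a type $\tau$ with respect to a type environment $TE$ is
defined as: $\mathrm{Clos}_{TE}(\tau) = \forall \alpha_1 \cdots \alpha_n . \tau$, where

$$\{\alpha_1, \ldots, \alpha_n\} = \mathrm{FTV}(\tau) \setminus \mathrm{FTV}(TE)$$

And the applicative closure of $\tau$ is defined as: $\mathrm{AppClos}_{TE}(\tau) = \forall \alpha_1 \cdots \alpha_n . \tau$, where

$$\{\alpha_1, \ldots, \alpha_n\} = (\mathrm{FTV}(\tau) \setminus \mathrm{FTV}(TE)) \cap \mathrm{AppTyVar}$$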
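
The difference between the two closure operations can be made concrete by computing which type variables each one generalizes. The following Standard ML sketch is illustrative only: the datatypes are my own simplified encoding of the types above, and the type environment is represented just by the list of its free type variables (an assumption made to keep the example short).

```sml
(* Type variables are tagged as imperative or applicative. *)
datatype tyvar = Imp of string | App of string

datatype ty
  = TyCon of string
  | TyVar of tyvar
  | Fun of ty * ty
  | Prod of ty * ty
  | Chan of ty
  | Event of ty

(* Order-preserving union of two duplicate-free lists. *)
fun union (xs, ys) =
  xs @ List.filter (fn y => not (List.exists (fn x => x = y) xs)) ys

(* Free type variables of a type. *)
fun ftv (TyCon _) = []
  | ftv (TyVar a) = [a]
  | ftv (Fun (t1, t2)) = union (ftv t1, ftv t2)
  | ftv (Prod (t1, t2)) = union (ftv t1, ftv t2)
  | ftv (Chan t) = ftv t
  | ftv (Event t) = ftv t

(* Variables bound by the closure: FTV of the type minus FTV of the environment. *)
fun closVars (envFtv, t) =
  List.filter (fn a => not (List.exists (fn b => b = a) envFtv)) (ftv t)

fun isApplicative (App _) = true
  | isApplicative (Imp _) = false

(* Variables bound by the applicative closure: only the applicative ones. *)
fun appClosVars (envFtv, t) = List.filter isApplicative (closVars (envFtv, t))

(* Example type: t -> u chan, with t applicative and u imperative. *)
val sample = Fun (TyVar (App "t"), Chan (TyVar (Imp "u")))
```

With an empty environment, closVars generalizes both variables of the sample type, whereas appClosVars leaves the imperative variable u free.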

7.1 Expression Typing Rules 

To associate types with the constants, we assume the existence of a function 

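$$\mathrm{TypeOf} : \mathrm{Const} \rightarrow \mathrm{TyScheme}$$

For the concurrency related constants, it assigns the following type schemes:

never : $\forall \alpha . (\text{unit} \rightarrow \alpha \text{ event})$
receive : $\forall \alpha . (\alpha \text{ chan} \rightarrow \alpha \text{ event})$
transmit : $\forall \alpha . ((\alpha \text{ chan} \times \alpha) \rightarrow \text{unit event})$
wrap : $\forall \alpha \beta . ((\alpha \text{ event} \times (\alpha \rightarrow \beta)) \rightarrow \beta \text{ event})$
choose : $\forall \alpha . ((\alpha \text{ event} \times \alpha \text{ event}) \rightarrow \alpha \text{ event})$
guard : $\forall \alpha . ((\text{unit} \rightarrow \alpha \text{ event}) \rightarrow \alpha \text{ event})$
wrapAbort : $\forall \alpha . ((\alpha \text{ event} \times (\text{unit} \rightarrow \text{unit})) \rightarrow \alpha \text{ event})$

We also assume that there are no event-valued constants. More formally, we require
that there does not exist any $b$ such that $\mathrm{TypeOf}(b) = \tau \text{ event}$, for some type $\tau$.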

The typing rules for $\lambda_{cv}$ are divided into two groups. The core rules are given
in Figure 7. There are two rules for let: the rule ($\tau$-app-let) applies in the
non-expansive case (in the syntax of $\lambda_{cv}$, this is when the bound expression is in Val);
the rule ($\tau$-imp-let) applies when the expression is expansive (not a value). There
are also rules for typing channel names, and pair expressions. The rule ($\tau$-chan)
restricts the type of the introduced channel to be imperative. In addition to these
core typing rules, there are rules for the other syntactic forms (see Figure 8). Given
the appropriate environment, these rules can be derived from rule ($\tau$-app) (rule
($\tau$-const) in the case of $\Lambda$). It is useful, however, to include them explicitly. As before, it
is worth noting that the syntactic form of a term uniquely determines which typing
rule applies.

In order that the typing of constants be sensible, we impose a typability
restriction on the definitions of $\delta$ and TypeOf. If $\mathrm{TypeOf}(b) \succ (\tau' \rightarrow \tau)$ and $TE \vdash v : \tau'$,
then $\delta(b, v)$ is defined and $TE \vdash \delta(b, v) : \tau$. It is worth noting that the $\delta$ rules we
defined for the concurrency constants respect this restriction.



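$$\frac{\mathrm{TypeOf}(b) \succ \tau}{TE \vdash b : \tau} \quad (\tau\text{-const})$$

$$\frac{x \in \mathrm{dom}(VT) \qquad VT(x) \succ \tau}{(VT, CT) \vdash x : \tau} \quad (\tau\text{-var})$$

$$\frac{CT(\kappa) = \theta}{(VT, CT) \vdash \kappa : \theta} \quad (\tau\text{-chvar})$$

$$\frac{TE \vdash e_1 : (\tau' \rightarrow \tau) \qquad TE \vdash e_2 : \tau'}{TE \vdash e_1\ e_2 : \tau} \quad (\tau\text{-app})$$

$$\frac{TE \pm \{x \mapsto \tau\} \vdash e : \tau'}{TE \vdash \lambda x (e) : (\tau \rightarrow \tau')} \quad (\tau\text{-abs})$$

$$\frac{TE \vdash e_1 : \tau_1 \qquad TE \vdash e_2 : \tau_2}{TE \vdash (e_1 . e_2) : (\tau_1 \times \tau_2)} \quad (\tau\text{-pair})$$

$$\frac{TE \vdash v : \tau' \qquad TE \pm \{x \mapsto \mathrm{Clos}_{TE}(\tau')\} \vdash e : \tau}{TE \vdash \text{let } x = v \text{ in } e : \tau} \quad (\tau\text{-app-let})$$

$$\frac{TE \vdash e_1 : \tau' \qquad TE \pm \{x \mapsto \mathrm{AppClos}_{TE}(\tau')\} \vdash e_2 : \tau}{TE \vdash \text{let } x = e_1 \text{ in } e_2 : \tau} \quad (\tau\text{-imp-let})$$

$$\frac{TE \pm \{x \mapsto \theta \text{ chan}\} \vdash e : \tau}{TE \vdash \text{chan } x \text{ in } e : \tau} \quad (\tau\text{-chan})$$

Fig. 7. Core type inference rules for $\lambda_{cv}$


7.2 Process Typings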

A process typing is a finite map from process identifiers to types: 

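$$PT \in \mathrm{ProcTy} = \mathrm{ProcId} \xrightarrow{\text{fin}} \mathrm{Ty}$$

Typing judgements are extended to process configurations by the following definition.

Definition 16. A well-formed configuration $\mathcal{K},\mathcal{P}$ has type $PT$ under a channel
typing $CT$, written

$$CT \vdash \mathcal{K},\mathcal{P} : PT$$

if the following hold:

- $\mathcal{K} \subseteq \mathrm{dom}(CT)$,

- $\mathrm{dom}(\mathcal{P}) \subseteq \mathrm{dom}(PT)$, and

- for every $\langle \pi; e \rangle \in \mathcal{P}$, $(\{\}, CT) \vdash e : PT(\pi)$.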



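$$\frac{TE \vdash e : (\text{unit} \rightarrow \tau)}{TE \vdash \text{spawn } e : \text{unit}} \quad (\tau\text{-spawn})$$

$$\frac{TE \vdash e : \tau \text{ event}}{TE \vdash \text{sync } e : \tau} \quad (\tau\text{-sync})$$

$$\frac{TE \vdash e : \tau \text{ event}}{TE \vdash (\mathbf{G}\, e) : \tau \text{ event}} \quad (\tau\text{-guard})$$

$$\frac{\forall \alpha . \alpha \text{ event} \succ \tau}{TE \vdash \Lambda : \tau} \quad (\tau\text{-never})$$

$$\frac{TE \vdash \kappa : \tau \text{ chan} \qquad TE \vdash v : \tau}{TE \vdash \kappa ! v : \text{unit event}} \quad (\tau\text{-output})$$

$$\frac{TE \vdash \kappa : \tau \text{ chan}}{TE \vdash \kappa ? : \tau \text{ event}} \quad (\tau\text{-input})$$

$$\frac{TE \vdash ev : \tau' \text{ event} \qquad TE \vdash e : (\tau' \rightarrow \tau)}{TE \vdash (ev \Rightarrow e) : \tau \text{ event}} \quad (\tau\text{-wrap})$$

$$\frac{TE \vdash ev_1 : \tau \text{ event} \qquad TE \vdash ev_2 : \tau \text{ event}}{TE \vdash (ev_1 \oplus ev_2) : \tau \text{ event}} \quad (\tau\text{-choice})$$

$$\frac{TE \vdash ev : \tau \text{ event} \qquad TE \vdash v : (\text{unit} \rightarrow \text{unit})}{TE \vdash (ev \mid v) : \tau \text{ event}} \quad (\tau\text{-abort})$$

Fig. 8. Other type inference rules for $\lambda_{cv}$


For CML, where spawn requires a $(\text{unit} \rightarrow \text{unit})$ argument, the process typing is
$PT(\pi) = \text{unit}$ for all $\pi \in \mathrm{dom}(\mathcal{P})$.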


7.3 Type Soundness 

This section presents a statement of the soundness of the above type system with
respect to the dynamic semantics of Section 6. To prove these results, I use the
“syntactic” approach of Wright and Felleisen [WF91] (see [Rep92] for the proofs). The
basic idea is to show that evaluation preserves types (also called subject reduction);
then characterize run-time type errors (called “stuck states”) and show that stuck
states are untypable. This allows us to conclude that well-typed programs cannot
go wrong.

The first step in this process is to show that the sequential evaluation relation 
preserves the types of expressions: 

Theorem 17 (Sequential type preservation). For any type environment $TE$,
expression $e_1$ and type $\tau$, such that $TE \vdash e_1 : \tau$, if $e_1 \mapsto e_2$ then $TE \vdash e_2 : \tau$.

This result is then extended to concurrent evaluation and process typings.

Theorem 18 (Concurrent type preservation). If a configuration $\mathcal{K},\mathcal{P}$ is
well-formed with

$$\mathcal{K},\mathcal{P} \Rightarrow \mathcal{K}',\mathcal{P}'$$

and, for some channel typing $CT$,

$$CT \vdash \mathcal{K},\mathcal{P} : PT$$

then there is a channel typing $CT'$ and a process typing $PT'$, such that the following
hold:

- $CT \subseteq CT'$,

- $PT \subseteq PT'$,

- $CT' \vdash \mathcal{K}',\mathcal{P}' : PT'$, and

- $CT' \vdash \mathcal{K},\mathcal{P} : PT'$.

With these results, it is fairly easy to show soundness results: 

Theorem 19 (Syntactic soundness). Let $e$ be a program, with $\vdash e : \tau$. Then, for
any $T \in \mathrm{Comp}(e)$, $\pi \in \mathrm{Procs}(T)$, with $\mathcal{K}_i,\mathcal{P}_i$ the first occurrence of $\pi$ in $T$, there
exists a $CT$ and $PT$, such that

$$CT \vdash \mathcal{K}_i,\mathcal{P}_i : PT$$

and $PT(\pi_0) = \tau$. And either

- $\pi \Uparrow_T$, or

- $\pi \Downarrow_T v$ and there exists an extension $CT'$ of $CT$ with $(\{\}, CT') \vdash v : PT(\pi)$.

Theorem 20 (Soundness). If $e$ is a program with $\vdash e : \tau$, then for any
computation $T \in \mathrm{Comp}(e)$ and any process ID $\pi \in \mathrm{Procs}(T)$, the following hold:

(Strong soundness) If $\mathrm{eval}_T(\pi) = v$, and $\mathcal{K}_i,\mathcal{P}_i$ is the first occurrence of $\pi$ in $T$,
then for any $CT$ and $PT$, such that $CT \vdash \mathcal{K}_i,\mathcal{P}_i : PT$ and $PT(\pi_0) = \tau$, there is
an extension $CT'$ of $CT$, such that $(\{\}, CT') \vdash v : PT(\pi)$.

(Weak soundness) $\mathrm{eval}_T(\pi) \neq \text{WRONG}$


8 Related Work 

There are many approaches to concurrent language design (see either [AS83] or 
[And91] for an overview); our approach is an offshoot of the CSP-school of con- 
current language design. CML began as a reimplementation of the concurrency 
primitives of PML [Rep88] in SML/NJ, but has evolved into a significantly more
powerful language. PML in turn was heavily influenced by Amber [Car86]. There
have been other attempts at adding concurrency to various versions of ML. Most of 
these have been based on message passing ([Hol83], [Mat89], and [Ram90] for exam- 
ple), but there is at least one shared memory approach [CM90]. As we have shown in 
this paper, message passing fits very nicely into SML. It allows an applicative style 
of programming to be used most of the time; the state modifying operations are 
hidden in the thread and channel abstractions. CML extends the message passing 
paradigm by making synchronous operations first-class, which provides a mechanism 
for building user-defined synchronization abstractions. 

Using concurrency to implement interactive systems has been proposed and im- 
plemented by several people. In [RG86], we made the argument that concurrency is 
vital for the construction of interactive programming environments. Pike (in [Pik89])
and Haahr (in [Haa90]) describe experimental window systems built out of threads
and channels, but neither of these was fast enough for real use.

The semantics of Facile have been proposed as a model for PML in [PGM90], 
but no translation from PML to Facile is given. Independent work by Berry, Milner 
and Turner at the University of Edinburgh has resulted in an operational semantics 
for a small concurrent language, which includes the PML version of events [BMT92]. 
The semantics presented in this paper has some strong similarities with that of 
[BMT92], but $\lambda_{cv}$ is a richer language; in particular, the language of [BMT92] does
not include the guard and wrapAbort event value constructors found in CML. An 
earlier version of the semantics of first-class synchronous operations was presented in 
[Rep91b], and a more complete treatment of the version presented here can be found 
in [Rep92] (including extensions to cover features such as polling and exceptions). 


9 Conclusions 


We have described a higher-order concurrent language, CML, its use in real-world 
applications, and its formal semantics. CML supports first-class synchronous oper- 
ations, which provide a powerful mechanism for communication and synchronization 
abstraction. Our experience with CML “in-the-field” and our measurements of the 
performance of the implementation show that CML is a practical tool for building 
real systems. We feel that CML is unique in that it combines a flexible high-level 
notation with good performance. In addition, it is well-defined, with a formal se- 
mantics. 

CML is a stable system, and is freely available for distribution. The latest versions
of both CML and its manual are available via anonymous ftp in the /pub
directory on ftp.cs.cornell.edu; for more information send electronic mail to 
cml-bugs@cs.cornell.edu. In addition, both CML and eXene will be included 
as part of the SML/NJ system (most likely in the fall of 1992). 